Abstract

Background: The caleosin genes encode proteins with a single conserved EF hand calcium-binding domain and 
comprise small gene families found in a wide range of plant species. Some members of the gene family have been 
shown to be upregulated by environmental stresses including low water availability and high salinity. Caleosin 3 
from wheat has been shown to interact with the α-subunit of the heterotrimeric G proteins, and to act as a GTPase
activating protein (GAP). This study characterizes the size and diversity of the gene family in wheat and related 
species and characterizes the differential tissue-specific expression of members of the gene family. 

Results: A total of 34 gene family members that belong to eleven paralogous groups of caleosins were identified 
in the hexaploid bread wheat, T. aestivum. Each group was represented by three homeologous copies of the gene 
located on corresponding homeologous chromosomes, except the caleosin 10, which has four gene copies. Ten 
gene family members were identified in diploid barley, Hordeum vulgare, and in rye, Secale cereale, seven in 
Brachypodium distachyon, and six in rice, Oryza sativa. Gene expression was assayed in triticale and
rye by RNA-Seq analysis of 454 sequence sets, and members of the gene family were found to have diverse patterns
of expression in the different tissues that were sampled in rye and in triticale, the hybrid hexaploid species
derived from wheat and rye. Expression of the gene family in wheat and barley was also determined from existing
microarray data, and changes in expression during development and in response to environmental stresses are
presented.

Conclusions: The caleosin gene family had a greater degree of expansion in the Triticeae than in the other
monocot species, Brachypodium and rice. The prior implication of one member of the gene family in the stress
response and heterotrimeric G protein signaling points to the potential importance of the caleosin gene family.
The complexity of the family and differential expression in various tissues and under conditions of abiotic stress
suggest that caleosin family members may play diverse roles in signaling and development that
warrant further investigation.

Keywords: Caleosin gene family, Calcium-binding protein, Phylogenetic analysis, Tissue-specific expression, GAP,
Gα, Heterotrimeric G protein signaling, RNA-seq



Background 

Caleosins are calcium-binding proteins encoded by small 
gene families in plants, and some members of the gene 
family have been shown to play an important role in signaling
and in the response to stress. Ta-Clo3, encoded by a
member of this gene family in wheat (Triticum aestivum),
was shown to have GAP activity with the heterotrimeric G
protein subunit, Gα [1]. The caleosins do not have significant
sequence similarity with the Regulators of G protein
Signaling (RGS) in other species and appear to be a new
class of proteins that act as heterotrimeric G protein
GAPs. The wheat Clo3 was also shown to interact with
phosphoinositide-specific phospholipase C1 (PI-PLC1)
in vivo and in vitro, and the interaction between Clo3, Gα
and PI-PLC1 was found to be competitive [1]. A homolog
of Ta-Clo3 in Arabidopsis, At-Clo3, also known as Responsive
to Dehydration 20 (RD20), has been shown to be
strongly induced by drought, abscisic acid and high salinity
and was experimentally shown to bind calcium [2].
The clo3/rd20 mutants in Arabidopsis showed enhanced
sensitivity to drought conditions and RD20 was implicated
in the control of stomata aperture, reduction of growth,
and increased transpirational efficiency [3]. Plants' responses
to water deficit are known to activate signal transduction
cascades that increase the level of secondary
messengers, including calcium; thus some members of the
caleosin gene family appear to play a critical role in water
deficit signaling and to link calcium regulation to G protein
signaling. Analysis of caleosins in barley also suggests
a role in lipid trafficking and membrane expansion [4].
Caleosin-assembled oil bodies have been proposed
as useful components of a nano-carrier for therapeutic 
purposes, and have been specifically used as drug car- 
riers, targeting cancer cells [5]. It is unknown if the role 
of caleosins in the stress response is related to their role 
in lipid bodies, or if they are simply different functions 
carried out by different members of the gene families. 
Caleosins comprise a gene family of seven members in 
Arabidopsis and the rice genome contains five gene family 
members. 

Bread wheat, Triticum aestivum, is one of the most important
cereal species grown world-wide. It is an allohexaploid,
derived from two polyploidization events. The first
hybridization, between the diploid T. urartu and an unknown
species thought to be closely related to Aegilops
speltoides, which contributed the A and B genomes, respectively,
occurred approximately 500,000 years ago. The
tetraploid species was domesticated as T. turgidum, commonly
known as pasta wheat. The second polyploidization
occurred between T. turgidum and Aegilops tauschii, the D
genome progenitor, approximately 8,000 years ago. Hordeum
vulgare, barley, and Secale cereale, rye, are closely
related crop species that belong to the tribe Triticeae, estimated
to have diverged from the Triticum-Aegilops
lineage 11 and 7 MYA, respectively [6]. Hexaploid triticale
(x Triticosecale) is a synthetic hybrid crop species first developed
in the late 19th century by crosses between Secale
cereale and T. turgidum, and contains A, B, and R genomes
[7]. Triticale, grown largely for livestock feed, as
well as for human consumption, has an important poten- 
tial as a crop, especially under conditions that are less fa- 
vorable for wheat cultivation, such as marginal soils in 
drought-prone regions. Generally, it combines the grain 
quality and yield potential of wheat with the environmen- 
tal stress and disease tolerance of rye [7]. Compared to 
wheat, triticale appears to have a higher resistance to 
many wheat fungal diseases and pests, as well as viral dis- 
eases. In addition to tolerance to conditions of drought, 
triticale varieties are able to adapt to stress conditions 
such as excess moisture and acidic soils [7]. Triticale is 
also an important model for investigation of the rapid 
changes involving genomic remodeling and changes in 
gene expression subsequent to polyploidization. 



In order to facilitate the analysis of the members of 
the caleosin gene family and investigate the diverse roles 
these proteins may play in signaling, we report a descrip- 
tion of the whole gene family in hexaploid wheat, diploid 
rye, and triticale, based on analysis of high-throughput 
cDNA sequencing data sets and compare these to the 
other diploid species including barley (Hordeum vulgare),
Brachypodium distachyon, rice, and Arabidopsis. 

Methods 

Caleosin contigs assembly 
Triticum aestivum 

The calcium binding protein Ta-Clo3/J900 from wheat 
was used for BLASTp [8] searches in the GenBank NR 
databases for related caleosin gene sequences in Arabi- 
dopsis and rice. The complete set of Arabidopsis and 
rice caleosin amino acid sequences, as well as that of the 
Ta-Clo3 were then used to search, with tBLASTn, for re- 
lated sequences in the GenBank EST database for T. aes- 
tivum. The EST sequences were assembled into contigs 
using CAP3 [9] under high stringency parameters of a 
minimum sequence identity of 98%; minimum overlap 
length of 40 nt; gap penalty, 6; match value, 2; mismatch 
penalty (-5). Open reading frames and translation of the 
contigs were carried out with the ExPASy translate tool 
[10] and confirmed by BLASTx with the GenBank NR 
database comparison to related sequences, as well as by 
comparison among the T. aestivum caleosin sequences 
as the data set was developed. Contigs were manually
edited to obtain full length cDNA sequences by first
identifying any partial length contigs by BLASTx to the
NR database and then identifying additional ESTs with
100% identity and a minimum overlap of 120 nt in the
T. aestivum EST database at NCBI. ESTs were selected
which could extend the 5′ or 3′ end of the initial contig.
CAP3 was used to assemble all ESTs together with partial
sequences to generate full-length contigs. After an
initial set of contigs was completed, the process was reiterated
to identify additional ESTs and additional homeologous
members of the gene family that were not
represented in the initial contig set. The T. aestivum
caleosin contigs were used to search the Wheat Survey
Sequences (WSS) of the International Wheat Genome
Sequencing Consortium [11] in order to identify the
chromosome or chromosome arm on which the gene
was located. One cDNA, Clo4-B, not represented in the
T. aestivum GenBank EST database, was assembled independently
from 454 (Roche) cDNA sequences of triticale
and from matches of the homeologous Clo4-A to
the WSS genomic wheat sequence database.
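
The EST selection criterion used for manual contig extension can be illustrated with a minimal Python sketch. It assumes ungapped, exact (100% identity) overlaps between the 3′ end of a contig and the 5′ end of an EST, with the 120 nt minimum stated above. The function names are hypothetical, and the sketch does not reproduce how BLAST or CAP3 compute overlaps.

```python
def overlap_length(left, right, min_overlap=120):
    """Length of the longest suffix of `left` that exactly matches a prefix
    of `right`; returns 0 if no exact overlap of at least `min_overlap` exists."""
    max_len = min(len(left), len(right))
    # Try the longest possible overlap first and shrink towards the minimum.
    for length in range(max_len, min_overlap - 1, -1):
        if left[-length:] == right[:length]:
            return length
    return 0


def can_extend_3prime(contig, est, min_overlap=120):
    """True if `est` overlaps the 3' end of `contig` with 100% identity
    and carries additional sequence beyond the overlap."""
    length = overlap_length(contig, est, min_overlap)
    return length > 0 and len(est) > length
```

Extension of the 5′ end follows from the same function by swapping the arguments, so that the EST is the left-hand sequence and the contig the right-hand one.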

Secale cereale 

Eleven paralogous wheat caleosin genes were used to
search for orthologs in five 454-cDNA rye libraries
obtained from anther, pistil, crown, roots, and stem,
using BLASTn. All 454-cDNAs that matched wheat
caleosin genes with a minimum overlap of 100 nt were selected
as caleosin homologs regardless of their BLASTn
scores or percent identities. The selected candidate rye
454-cDNAs were assembled using CAP3 with the same
parameters as described above for T. aestivum, except that the
minimum overlap length was set at 35 nt. Using these
high stringency assembly parameters, the CAP3 assembly
yielded 45 contigs. The preliminary set of contigs was
searched with the eleven wheat paralogs using BLASTn to
eliminate duplicates and to select contigs with the longest
contig length and highest sequence similarity. Contigs assembled
with a low depth of coverage were individually
compared to the rye 454-cDNAs to confirm the accuracy
of their assemblies.
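
The duplicate-elimination step can be sketched as follows. The record layout and the tie-breaking order (contig length first, then percent identity) are assumptions made for illustration; the text states only that the longest and most similar contigs were retained.

```python
from collections import namedtuple

# Hypothetical record for one BLASTn hit of a rye contig against a wheat paralog.
Hit = namedtuple("Hit", ["contig", "paralog", "length", "identity"])


def best_contig_per_paralog(hits):
    """Keep one contig per wheat paralog, preferring longer contigs and,
    among contigs of equal length, the higher percent identity."""
    best = {}
    for hit in hits:
        current = best.get(hit.paralog)
        if current is None or (hit.length, hit.identity) > (current.length, current.identity):
            best[hit.paralog] = hit
    return best
```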

Hordeum vulgare 

The 11 paralogous caleosin gene sequences of wheat were
used to search the GenBank databases for homologous
genes in barley (H. vulgare), the best characterized member
of the Triticeae. Gene sequences were identified for
eight of these genes in the nr database of full length nucleotide
sequences. The sequence of Hv-Clo9 [GenBank:
AK375872.1] had a frame-shift error that became apparent
by comparison to the wheat orthologs; the sequence
was corrected by comparison to sequences in the GenBank
barley EST database. An additional barley caleosin,
Hv-Clo8, was assembled from EST sequences in the GenBank
EST database. Another contig, Hv-Clo11, was identified
in the second generation sequence database for barley
[12] by a BLASTn search of the assembly 1 morex rcba
database. In the latter case, the tentative barley transcript
was manually assembled from the output of the BLASTn
alignment of the barley subject sequence detected by the
wheat Ta-Clo11 caleosin query sequence.

Brachypodium distachyon 

T. aestivum caleosin protein sequences were used to
search the B. distachyon database [13] by tBLASTn. The
complete caleosin gene sequences and coding sequences
were acquired in FASTA format and were translated
with the ExPASy Translate tool [10]. In cases where the
original annotation appeared to have misidentified the
exon/intron junctions or start codons, the flanking sequence
for the annotated gene region was reanalyzed and
annotated manually by searching for extended ORFs and
sequence similarity to the caleosin protein sequences from
T. aestivum.

Caleosin genes conserved domains and family 
phylogenetic tree 

The most conserved region of the gene family was determined
by using the NCBI Batch Conserved Domain Search
Tool [14] for all contigs, and the result was confirmed
with conservation scores calculated by Jalview. The calcium
binding domain EF-hand motif was identified by
alignment of all contigs with EF-hand motifs (1X05,
1NYA) and calcium binding proteins (1TIZ, 1NKF,
30X6) obtained from the Protein Data Bank [15]. The
results were verified based on the EF-hand motif in Arabidopsis,
described by Takahashi and coworkers [2]. The
structural and functional features of caleosin genes were
examined using InterPro Scan Sequence Search [16],
and the Simple Modular Architecture Research Tool
(SMART) [17] in genomic mode. These two programs
were used in parallel to support the result from the NCBI
Search Tool and to verify the presence of the EF-hand
motif.

A phylogenetic tree was constructed using wheat caleo- 
sin nucleotide sequences aligned using Molecular Evolu- 
tionary Genetics Analysis (MEGA), version 5.1 [18]. The 
maximum likelihood method was employed based on the 
Jukes-Cantor model [19]. Initial tree(s) for the heuristic 
search were obtained automatically as follows: When the 
number of common sites was < 100 or less than one 
fourth of the total number of sites, the maximum parsi- 
mony method was used; otherwise BIONJ method with 
MCL distance matrix was used. A discrete Gamma distri- 
bution was used to model evolutionary rate differences 
among sites (5 categories (+G, parameter = 1.2379)). The 
analysis involved 35 nucleotide sequences. All positions 
containing gaps and missing data were eliminated. There 
were a total of 510 positions in the final dataset. 
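
As a brief illustration of the substitution model named above, the Jukes-Cantor correction converts the observed proportion p of differing sites between two aligned sequences into an estimated number of substitutions per site, d = -(3/4) ln(1 - (4/3) p). The Python sketch below computes this pairwise distance only; it is not the maximum likelihood tree search performed in MEGA, and the function name is hypothetical.

```python
import math


def jukes_cantor_distance(seq_a, seq_b):
    """Jukes-Cantor distance between two aligned nucleotide sequences.

    Columns where either sequence has a gap or an ambiguous base are skipped,
    mirroring the elimination of gapped positions described in the text.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
             if a in "ACGT" and b in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    p = sum(a != b for a, b in pairs) / len(pairs)
    if p >= 0.75:
        # The correction is undefined at or beyond saturation.
        return math.inf
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)
```

For small p the corrected distance is only slightly larger than p itself; the correction grows as multiple substitutions at the same site become more likely.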

Multiple sequence alignment and phylogenetic tree construction
for caleosins from six species were performed
using MEGA, version 5.1 [18]. Sequence comparisons were
based on the 174 amino acid caleosin domain of protein
sequences from eleven paralogous Clo genes (one representative
of each homeologous gene group) of hexaploid
T. aestivum, and the gene sequences from the diploid species
H. vulgare, B. distachyon and S. cereale, as well as
those of O. sativa and A. thaliana. The amino acid sequences
from the six species were aligned using MUSCLE
[20]. Phylogenetic trees were constructed using the maximum
likelihood method, which uses standard
statistical techniques for inferring probability distributions
to assign probabilities to possible phylogenetic trees.
The Whelan and Goldman (WAG) model for amino acids [21], an
empirical model of globular protein evolution, was employed. Initial tree(s)
for the heuristic search were obtained automatically as described
above. A discrete Gamma distribution was used to
model evolutionary rate differences among sites (5 categories
(+G, parameter = 1.0411)). The analysis involved
54 amino acid sequences. All positions containing gaps
and missing data were eliminated. There were a total of
93 amino acid positions included in the final dataset.
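
The elimination of positions containing gaps or missing data (complete deletion) can be sketched in a few lines of Python. The function name and the choice of "-" and "?" as gap and missing-data symbols are assumptions for illustration, not a description of MEGA's internals.

```python
def complete_deletion(alignment, missing="-?"):
    """Remove every alignment column in which any sequence has a gap or
    missing-data symbol. `alignment` maps sequence names to aligned strings."""
    rows = list(alignment.values())
    if not rows:
        return {}
    if len({len(row) for row in rows}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    # Indices of columns that are fully resolved in every sequence.
    keep = [i for i in range(len(rows[0]))
            if all(row[i] not in missing for row in rows)]
    return {name: "".join(seq[i] for i in keep)
            for name, seq in alignment.items()}
```

Because a single gapped sequence removes the column for all sequences, this filter can shorten an alignment considerably, which is consistent with a 174 amino acid domain yielding fewer positions in the final dataset.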

Intron/exon structure of caleosin genes

The intron/exon structure of the wheat caleosin genes
was determined by comparing the full length cDNA sequence
to the genomic sequence in the IWGSC WSS
database [11]. The intron/exon structure of H. vulgare
and Brachypodium caleosin genes was similarly characterized
by comparing the full length cDNA sequences to
genomic sequence contigs at the Gramene database [22].

Rye and triticale 454-cDNA library construction 
Plant material and growth conditions 

Rye (Secale cereale) and hexaploid triticale (x Triticosecale
Wittm.) seedlings were grown in 15 cm diameter
plastic pots containing soil-less mixture (Cornell mix) in
a temperature-controlled growth chamber maintained at
20°C (day) and 18°C (night) under a photoperiod of
16 h light (250-275 μE m⁻² s⁻¹) provided by fluorescent
and incandescent light. Plants were held at a constant
humidity of 70% and watered daily. Specific cultivars
grown, and the tissues that were harvested at specific
developmental stages, are listed in Additional file 1: Table
S1 and Additional file 2: Table S2. When tissue was harvested
it was frozen immediately in liquid nitrogen.
Floral tissues from triticale and rye were harvested from
plants grown as described by Tran et al. [23] at different
Zadoks developmental stages [24]. For salt treatment
analysis, rye cultivar Musketeer and triticale cultivar AC
Certa plants were grown in hydroponic tanks containing
a modified Hoagland's solution [25] with a light cycle of
16 h light and 8 h darkness, with day/night temperatures
of 22°C and 15°C, respectively. The growth solution was
replaced at days seven and 14; at day 21 the hydroponic
solution was replaced with fresh growth solution supplemented
with 100 mM NaCl and 14 mM CaCl2, and plants were treated
for 24 h and harvested. For the polyethylene glycol
(PEG) treatment, three day old germinated seeds were
placed on the surface of an artificial medium (50 ml) containing
0%, 27%, 31% or 34% PEG 35,000 in Magenta
boxes (2 seedlings/box) [26], kept in the growth chamber
30 cm beneath 40 watt Sylvania Gro-Lux Wide Spectrum
lamps delivering 80 μM of light with a 16 h light period at
18°C and grown for 21 days.

cDNA library and sequencing

For construction of the standard cDNA libraries, 0.6 mg total
RNA was used to purify poly(A)+ mRNA using the Poly(A)
Purist™ Kit (Ambion, Inc). First strand cDNA synthesis
was initiated by an anchored poly(T) and SuperScript III.
Then, the second strand of cDNA was made using Invitrogen
reagents.

For the anther libraries [23], total RNA was extracted
from rye anthers (200 mg) using the Concert™ Plant
RNA Reagent (Invitrogen, Burlington, ON, Canada) according
to the manufacturer's instructions. The total
RNA was further purified using the RNeasy kit (Qiagen,
Mississauga, ON, Canada) following the manufacturer's
instructions. Poly(A)+ RNA was isolated from 50 μg total
RNA using Dynabeads (Invitrogen, Burlington, ON,
Canada) in accordance with the manufacturer's instructions.
Rye anther cDNA was generated using approximately
200 ng of poly(A)+ RNA and the SMARTer™ PCR
cDNA synthesis kit (Clontech, Mountain View, CA,
USA). The resulting cDNA was PCR amplified for 15 cycles.
The PCR amplified cDNA was purified using the
MinElute Reaction Kit (Qiagen, Mississauga, ON,
Canada) and used as a template for 454 sequencing. Five
micrograms of ds cDNA from the different libraries were
sent to the Plant Biotechnology Institute, Saskatoon, SK,
Canada, for 454 GS FLX Titanium sequencing. Root
cDNA libraries from the salt stress experiments were sequenced
using the same technology at Genome Quebec
Innovation Centre, Montreal, PQ, Canada. The number
of replicate libraries for each tissue ranged from 2
to 6 and is listed in Additional file 1: Table S1 and
Additional file 2: Table S2.

Caleosin expression analysis 

The relative level of expression of caleosin gene family 
members in rye and in triticale was determined by analysis 
of Roche 454-cDNA sequence libraries. A description of
the analysis is presented as a flowchart (Additional file 3:
Figure S1). 454-cDNA reads were converted from 454-sff
format to FASTQ format using the Galaxy server from Penn
State and Emory University [27]. The high quality transcripts
obtained from triticale and rye tissues were aligned
to their associated wheat, rye, and triticale caleosin reference
sequences using the CD-HIT-EST-2D biological sequence
clustering algorithm [28] using default parameters,
a word size of n = 5, and a similarity cutoff of 97%.
The reads that were uniquely mapped to each homeolog
were selected and counted. The expression of rye and
triticale cDNA reads in 454 libraries was normalized to
reads per kilobase of gene length per million reads to correct
the biases from differences in gene length and to
normalize expression among libraries of different sizes.
To characterize expression in stem tissue, cDNA libraries
were made from three genotypes sampled at two times of
development. Analysis by two-way ANOVA showed no
significant differences in expression for caleosins among
genotypes or between times of development; these six libraries
were therefore used as replicates, and the data were
analyzed by one-way ANOVA. The number of replicates
for each of the other tissues is listed in Additional file 1:
Table S1 and Additional file 2: Table S2.
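
The normalization to reads per kilobase of gene length per million reads can be written out as a short Python sketch. The function and argument names are hypothetical; the arithmetic follows the definition given above.

```python
def rpkm(read_count, gene_length_nt, total_reads):
    """Reads per kilobase of gene length per million reads in the library.

    read_count     -- reads uniquely mapped to one gene copy
    gene_length_nt -- length of the reference sequence in nucleotides
    total_reads    -- total number of cleaned reads in the library
    """
    if gene_length_nt <= 0 or total_reads <= 0:
        raise ValueError("gene length and library size must be positive")
    # 1e9 = 1e3 (nt per kilobase) * 1e6 (reads per million)
    return read_count * 1e9 / (gene_length_nt * total_reads)
```

For example, 50 reads mapped to a 1,000 nt transcript in a library of one million reads gives a value of 50, and the same count on a 2,000 nt transcript gives 25, which is how the length bias is removed.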

The relative level of tissue-specific expression for caleosin
gene family members in wheat and in barley was determined
by analysis of datasets from a 61 K [29] and a 22 K
[30] Affymetrix microarray, respectively, available at the
PLEXdb database [31]. The relative level of gene expression
in triticale in response to osmotic stress given as PEG
treatments was analyzed by comparison of 454 data sets
from treated and control plants. The changes in gene expression
in rye roots in response to salt treatment were
analyzed from 454 sequence libraries with the CD-HIT algorithm
[28]. The effect of cold stress on the expression of
caleosin family members was analyzed in the microarray
data set of Monroy et al. [32]. The effect of drought stress
on caleosins was analyzed with the microarray data of
Aprile et al. [33], available at the PLEXdb database [31];
the effect of ABA treatment in barley was analyzed with
the microarray data set of Rodriguez et al. (GEO Accession:
GSE10328), also from the PLEXdb database. The
identifiers for Clo genes on the Affymetrix microarrays
mentioned above can be found in Additional file 4.

Statistical analysis 

A two-way ANOVA was used to test for significant differences
in levels of caleosin gene expression among genotypes,
or between developmental stages of triticale stem
and anther tissues. The statistical significance of the difference
in the levels of expression among the different caleosin
gene family members in each rye and triticale tissue
was tested by a one-way ANOVA (p < 0.05). Duncan's
multiple range test was used to determine which caleosin
genes differed significantly in expression within each tissue type.
A χ² contingency test was also used to test the significance
of the difference between the level of expression of
the R genome copy of caleosin genes in rye and in triticale,
based on the null hypothesis that the level of individual R
genome caleosin transcripts in triticale was not different
from one third of the level of expression of the genes in
rye. One-way ANOVA was used to test the significance of
the differences in caleosin gene expression from microarray
data. Duncan's multiple range test was used to determine
which caleosin genes differed significantly in expression
across a panel of different tissues analyzed. Duncan's test
was also used to determine which caleosin genes expressed
in barley crown tissue differed significantly under control or ABA treatment
conditions. Note that the Affymetrix
wheat microarray data do not distinguish between the
homeologs within paralogous groups.
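
For readers unfamiliar with the one-way ANOVA used here, the following minimal Python sketch computes the F statistic from replicate expression values grouped by gene. It assumes each group holds the normalized expression values of one caleosin gene across replicate libraries; the function name is hypothetical, and neither the p-value lookup nor Duncan's multiple range test is included.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` is a list of lists of replicate values, one list per group.
    Raises ZeroDivisionError if there is no within-group variation.
    """
    k = len(groups)
    if k < 2 or any(len(g) == 0 for g in groups):
        raise ValueError("need at least two non-empty groups")
    n_total = sum(len(g) for g in groups)
    if n_total <= k:
        raise ValueError("need more observations than groups")
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of replicates around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within
```

The resulting F value would then be compared with the critical value of the F distribution for the two degrees of freedom at the chosen significance level (p < 0.05 in this study).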

Results 

Caleosin genes in T. aestivum

CAP3 assembly parameters ranging from 80% (default) 
to 99% identity were evaluated for the assembly of gene 
family members from hexaploid wheat and the optimal 
value was determined to be 98%. Assembly at a lower
minimum percent identity resulted in contigs that included
sequences from different homeologous gene copies.
Assembly at 99% minimum identity resulted in more
numerous and shorter contigs with more independent
contigs for the same gene.

Among the species surveyed, hexaploid T. aestivum 
had the largest caleosin gene family with 11 gene family 
members per haploid genome. In total, 34 full length 
caleosin-like cDNA sequences were identified in this 
species (Table 1; Additional file 5: Figure S2). All but 
three of these transcript sequences were compiled from 
the wheat EST database at NCBI. Three additional sequences
(Clo4-B, Clo10-2-D, Clo11-A) were obtained
from the WSS derived from second generation sequences
from genomic DNA. Clo10-2-D had a single
supporting EST sequence in the GenBank EST database,
and Clo11-A was supported by a single read from the
triticale 454-cDNA data set. Clo4-B was identified in the
WSS genomic database and the full coding region was
verified by several reads from the triticale 454-cDNA
data set, though Clo4-B from triticale has a 3 base pair
deletion relative to Clo4-B from T. aestivum. Six cDNA
sequences were confirmed by sequencing individual
cDNA clones. Pair-wise BLASTn and ClustalW2 [34]
comparison among the sequences identified 11 paralogous
sets of three genes corresponding to homeologous groupings
of genes derived from the three ancestral genomes of
wheat (Additional file 5: Figure S2).

The accuracy of each contig sequence was judged to 
be excellent, as the contigs generally had a minimum 
depth of at least three ESTs and many had a depth of 
four to ten sequences. There was also a very good agree- 
ment between the contigs assembled from EST se- 
quences, sequences of individual cDNA clones and the 
WSS assemblies, which indicates the high accuracy of 
the WSS database. Genes within homeologous groups 
had high nucleotide sequence similarity, ranging from 
96% to 97% nucleotide sequence identity within the cod- 
ing region. This high degree of similarity is common 
among homeologous copies of genes in wheat [35]. The 
WSS sequences are derived from shotgun 454 sequencing
of chromosome arm specific BAC libraries; thus sequences
have a chromosomal assignment, but not a map
location along the chromosomes. In nearly all cases,
homeologous copies of genes were located on the same
arm of homeologous chromosomes. One exception is
Clo9. The A and D genome copies of Clo9 are located
on the short arm of chromosome 4A and the long arm of
chromosome 4D, respectively, which are considered homeologous, whereas the B genome
copy is located on the short arm of 4B, which is not
homeologous to the other two chromosome arms [36].
Ten of the 11 paralogous groups had three homeologous
copies, but caleosin 10 had four gene copies, one on
each of the long arms of chromosomes 2A and 2B and
two copies identified on chromosome 2DL. These two
copies of Clo10 on 2DL had 93% nt sequence identity
within the coding region, which is somewhat lower than
the sequence identity among the homeologous copies on
chromosomes 2A and 2B and one of the D copies,
Clo10-2-D.

Table 1 Caleosin genes from four species^a

T. aestivum                                    B. distachyon, H. vulgare, O. sativa (japonica)^b
Gene    Chromosome  Nucleotide  Amino acids
Clo1    2AL         1600        302            Bradi5g15410  gi|34538472  gi|297723297
Clo1    2BL         1154        302
Clo1    2DL         1178        302
Clo2    2AL         963         244            Bradi5g15427  gi|34538476
Clo2    2BL         986         245            Bradi5g15417
Clo2    2DL         877         245
Clo3    6AS         907         217            Bradi1g70390
Clo3    6BS         818         215
Clo3    6DS         793         220
Clo4    3AL         1039        217            Bradi1g70400  gi|115467408
Clo4    3B          1075        217            Bradi1g44207
Clo4    3DL         1084        217
Clo5    6AL         1053        229            Bradi3g56810  gi|6900307
Clo5    6BL         1051        224
Clo5    6DL         1150        222
Clo6    6AL         827         216            Bradi3g56820  gi|151420803  gi|115467410
Clo6    6BL         735         205
Clo6    6DL         815         214
Clo7    7AS         827         198            Bradi1g44200  gi|151419867  gi|115448521
Clo7    7BS         917         213
Clo7    7DS         958         212
Clo8    7AS         1102        214            gi|326515641
Clo8    7BS         962         213
Clo8    7DS         939         215
Clo9    4AS         994         232            Bradi1g69571  gi|151426143
Clo9    4BS         1129        235
Clo9    4DL         1164        233
Clo10   2AL         907         234            gi|326490092  gi|115459382
Clo10   2BL         866         244
Clo10   2DL         897         243
Clo10   2DL         836         244
Clo11   2AL         907         234            ViroBlast
Clo11   2BL         1062

^a The full set of complete caleosin sequences for wheat and rye can be found in Additional file 16. The corrected Hv-Clo6 cDNA sequence of H. vulgare,
Hv-Clo8 derived from ESTs, and the Brachypodium sequence Bradi1g70400 with an extended CDS are included in Additional file 17.

^b Caleosin sequences for Brachypodium, H. vulgare, and O. sativa represent paralogous members of the gene family, and do not correspond to the individual
homeologous gene copies found in wheat.

The degree of similarity among the paralogous sequences
of wheat spanned a wide range; the most similar
paralogs were Clo10 and Clo11, which shared 89%
nucleotide sequence identity and 95% amino acid sequence
similarity within the conserved 174 amino acid EF-hand domain. The
most dissimilar members of the wheat caleosin gene family
were Clo10 and Clo4, which had 37% amino acid sequence
identity and 57% similarity within the conserved domain.
The size of the proteins encoded by members of the gene
family ranged from 198 to 302 amino acids. All members
of the gene family are characterized by the presence of a
single EF-hand calcium binding region of approximately
174 amino acids and a predicted transmembrane domain
of 20 amino acids.

Caleosin genes in Secale cereale and Hordeum vulgare

Ten full-length caleosin cDNA sequences in S. cereale
(rye) were assembled from the Roche 454-cDNA sequence
set. Only Caleosin 2 was not identified in our rye
sequence set, though a 239 nt fragment of a gene with
97% sequence similarity to wheat Caleosin 2 is present
in the whole genome shotgun sequence for rye [37]. This
indicates that all 11 Clo genes are present in rye. The
rye sequences showed between 90% and 99% sequence
identity with their homologs in wheat within the coding
region of each sequence. This high degree of similarity is
common between wheat, barley, and rye, which are all
members of the Triticeae tribe, as seen in the phylogenetic
diagram in Figure 1.

Nine full length cDNA sequences of caleosin genes
were identified in H. vulgare. Seven of these were obtained
from the non-redundant database at GenBank,
one sequence was assembled from ESTs, and one sequence,
Hv-Clo11, was tentatively derived from second
generation sequencing of the barley genomic sequence.
The latter sequence could not be confirmed independently,
but sequence differences between the other eight
full length sequences from barley and the assembly 1 morex
rcba suggest that Hv-Clo11, derived from the ViroBlast
high-throughput database, may contain some inaccuracies. A
Clo3 ortholog was not identified in the barley sequence
databases.

Caleosin genes in Brachypodium distachyon

Ten caleosin genes were identified in B. distachyon, a
species for which the complete genome sequence is
available. Two of the sequences were modified by extending
the ORF relative to the annotated sequence that
is available for the Brachypodium genome. All contigs
were verified as full-length by comparison to the wheat
sequences; open reading frames and protein sequences
were obtained.

Conserved structural elements 

The conserved domain in caleosins was identified by the
Batch CD Search Tool on NCBI and InterPro Scan for
each paralogous group from T. aestivum, H. vulgare, and
B. distachyon. Results were verified by comparing with
the conservation score calculated by aligning multiple
sequences with ClustalW2 [34]. The conserved EF-hand
calcium binding domain of 174 amino acids is common
to all caleosins. This EF-hand motif is composed of 36
amino acids with a calcium chelation loop and calcium
ligating residues. The DGSLFE box, which is the putative
casein kinase phosphorylation site, is also highly
conserved [2]. The alignment of all peptide sequences
from T. aestivum, H. vulgare, and B. distachyon using
ClustalW revealed the presence of other less conserved
motifs, including the GS loop, transcription termination
factor, and a trans-membrane domain.

Intron-exon structure of caleosin genes in T. aestivum,
H. vulgare, and Brachypodium

The 24 caleosin genes in T. aestivum for which full gen- 
omic sequences were available all contained 6 exons and 
5 introns. Ten wheat caleosin genes for which partial 
coverage by the genomic contigs was available, showed 
the same intron-exon structure for the gene region that 
was covered. The details of intron and exon lengths
for all available caleosins in T. aestivum are listed in
Additional file 6: Table S3. Among the H. vulgare caleosin
genes, five genes had six exons, one had four exons,
one had five exons, and one had seven exons; two barley
caleosins had only partial coverage by genomic contigs.
In Brachypodium, seven caleosins had six exons; these
include Bradi5g15410, Bradi5g15417.1, Bradi5g15427.1,
Bradi1g70390, Bradi1g44207, Bradi3g56820.1, Bradi1g44200,
and Bradi1g69571 (Bradi5g15417.1 was misannotated as
having 5 exons at the Gramene database); in contrast,
Bradi3g56810 and Bradi1g70400 have 5 exons; however,
these may have incomplete annotation. A summary of 
available caleosin gene exons for barley and Brachypodium 
can be found in Additional file 7: Table S4. 

Tissue-specificity of rye and triticale caleosin paralogs

The abundance of ESTs for ten rye caleosin genes as 
well as twenty-two genes from eleven paralogous groups 
assigned to the A and B subgenomes in triticale was inves- 
tigated to assess their relative expression level in more 
than 4.3 M and 7.1 M of 454 cleaned reads from four rye 
tissues (anther, crown, root, stem) and five triticale tissues 
(stigma, pollen, root, stem, anther), respectively. The data 
was normalized to take into account the sizes of the differ- 
ent tissue specific data sets and gene lengths, and is 
expressed as reads per kb of gene length per million 
454-cDNA reads (RPKM), and presented in Figures 2 
and 3, Additional file 8: Figure S3, and Additional file 9: 
Figure S4. 
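
The normalization described above can be sketched as follows. This is a minimal Python illustration of the RPKM definition given in the text (reads per kb of gene length per million reads), together with the summing over replicates mentioned in the figure captions; the function names and the example numbers are hypothetical and are not taken from the data sets.

```python
def rpkm(read_count, gene_length_bp, library_size):
    """Reads per kb of gene length per million reads in the library."""
    if gene_length_bp <= 0 or library_size <= 0:
        raise ValueError("gene length and library size must be positive")
    return read_count * 1e9 / (gene_length_bp * library_size)


def total_rpkm(replicates, gene_length_bp):
    """Sum RPKM over replicates given as (read_count, library_size) pairs."""
    return sum(rpkm(count, gene_length_bp, size) for count, size in replicates)


# Hypothetical example: a 1,000 bp gene in two replicate libraries.
example = total_rpkm([(50, 1_000_000), (30, 2_000_000)], gene_length_bp=1000)
print(example)
```

Dividing by both gene length and library size is what allows genes of different lengths, and tissues sequenced to different depths, to be compared on one scale.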

Caleosin gene family members were detected in all tis- 
sues sampled, and most tissues showed the expression of 
several paralogous members of the gene family. Expression
of three caleosin members (CloS, Clo7, and Clo9)
was detected in rye root tissue, with the expression of
Clo9 being dominant, and CloS and Clo7 detected at
very low levels (Figure 2). Most caleosin transcripts were
detected in multiple tissues; however, Clo6 and Clo10
were only detected in anther tissue, with Clo6 detected 
at relatively high levels (Figure 2 and Additional file 8: 
Figure S3), demonstrating high tissue specificity for 

Figure 1 Molecular phylogenetic analysis of the caleosin gene families of T. aestivum, H. vulgare, B. distachyon, S. cereale, O. sativa,
and A. thaliana amino acid sequences. The evolutionary history was inferred by using the maximum likelihood method based on the Whelan
and Goldman model [21]. The tree with the highest log likelihood (-3638.1820) is shown. The tree is drawn to scale, with branch lengths
measured in the number of substitutions per site. The values on the tree represent bootstrap confidence values inferred from 100 replicates.
Brachypodium gene identifiers are taken from http://brachypodium.org/ [13]; rice IDs are from GenBank and Arabidopsis IDs are from The
Arabidopsis Information Resource (TAIR) [38]. Only one representative of each wheat caleosin homeologous group was used in order to simplify
the phylogenetic tree. The relationship among the wheat gene family members is shown in Additional file 5: Figure S2.

Figure 2 The relative level of expression of ten caleosin gene
family members measured in a panel of rye tissues. The
expression of caleosin gene family members was estimated in 454
cDNA sequence libraries from anther, crown, root and stem rye
tissues using RNA-seq analysis. The 454-cDNAs aligned to each
caleosin member were counted, then normalized based on gene
lengths and library depths using the RPKM method. Values are the 
total RPKM of two replicates. A one-way ANOVA was carried out to 
test the significance of the differences in caleosin gene expression in 
each rye tissue. Duncan's multiple range test was used to determine 
the significant differences in caleosin gene expression in each tissue. 
Rankings determined by Duncan's test (p < 0.05), are denoted by 
different letters, and are indicated on each bar in the graph. 



these caleosin members. In contrast, CloS and Clo7 were 
detected in all four tissues (Additional file 8: Figure S3). 

Triticale showed a similar diversity of expression of all 
gene family members. Among the five tissue types in- 
cluded in the analysis, anther tissue was sampled at two 
stages of development: UNM (uninucleate microspore)
and TCP (tricellular pollen). A two-way ANOVA showed
significant differences between expression of caleosin
genes (p < 0.001), and also a significant interaction effect
between gene and developmental stage (p < 0.001), indicating
that there is a different level of caleosin gene
expression at different stages of anther development.
Expression of only two caleosin genes was detected in
each of triticale pollen (Clo2 and Clo6) and stigma
(Clo7 and CloS) tissues, and expression of five caleosin genes
(Clo2, CloS, Clo6, Clo7 and Clo9) was detected in root
tissue, with Clo7 and Clo9 expression being dominant
(Figure 3). Clo7 and Clo9 expression was also dominant
in stem tissue, and Clo7 was found to show a broad
tissue expression pattern, as it was detected in four of
the tissues investigated (Additional file 9: Figure S4). In
anthers, Clo7 showed high expression specifically at the
UNM stage, whereas it was not detected at the TCP stage.
Conversely, Clo11 was observed to be primarily expressed
at the TCP stage, and to a lesser extent at the UNM stage
of anther development (Figure 3). Therefore, there is
tissue specificity for certain caleosin members, as was 
observed with expression in rye tissues. Overall, these 
results demonstrate a very diverse pattern of tissue- 
specific expression of the caleosin genes. 

Tissue-specificity of wheat and barley caleosin paralogs 

Tissue-specificity of caleosin gene family members was 
also analyzed using independent data sets obtained from 
the PLEXdb database [31], and compared to the results 
for rye and triticale described above. Eight of the T. aestivum
caleosins are represented on the wheat 61 K Affymetrix
microarray; similarly, nine of the barley caleosins are
represented on the 22 K Affymetrix microarray. Gene expression
in multiple tissues and stages of development
was assayed in T. aestivum by Schreiber et al. [29], and in
barley by Druka et al. [30]. The wheat data shows a diversity
of expression of caleosin family members in the tissues
assayed, and appears to partially corroborate rye and
triticale expression results (Additional file 10: Table S5). 
The wheat microarray data reveals that the Clo9 paralog is 
relatively highly expressed in root tissue (Additional 
file 10: Table S5), as was found in rye and triticale expres- 
sion data described above. The wheat microarray data also 
reveals relatively high expression of the Clo7 paralog 
across tissues, in agreement with the expression data for 
triticale. Both Clo6 and Clo7 paralogs showed relatively
high expression in anther tissue in the wheat microarray
data (Additional file 10: Table S5), which was similar to
the observations in the 454 sequence data from both rye
and triticale, although Clo7 was expressed significantly less
than Clo6 in rye anther (Figures 2 and 3). In contrast,
Clo11 was seen to be highly expressed in anther tissue of
rye and triticale (Figures 2 and 3), but not in the wheat anthers
assayed by microarray analysis (Additional file 10:

Figure 3 The relative level of expression of eleven caleosin paralogs measured in a panel of triticale tissues. The expression of thirty-two
caleosin gene family members was individually measured in 454 cDNA sequence libraries from anther, pollen, root, stem, and stigma triticale tissues
using RNA-seq analysis. The 454-cDNAs aligned to each caleosin member were counted, then normalized based on gene lengths and library depths
using the RPKM method. The expression of each paralog is subdivided into expression of each of the three homeologs, visualized as black bars for the
R subgenome, grey bars for the B subgenome, and white bars for the A subgenome. Clo2 represented the expression of only the A and B homeologs.
Values are the total RPKM of two replicates for stigma and pollen, two replicates each for UNM (uninucleate microspore) and TCP (tricellular pollen)
anther tissues, and six replicates for stem. A one-way ANOVA was carried out to test the significance of global differences in caleosin gene expression
in each triticale tissue. Duncan's multiple range test was used to determine the significant differences in caleosin gene expression in each tissue.
Rankings determined by Duncan's test (p < 0.05) are denoted by different letters, and are indicated on each bar in the graph. Bars not labeled
are ranked as 'a'. In stem tissue, the A, B, and R homeologues of Clo5 and Clo11 are all ranked as 'ab', and the A and B homeologues of Clo4 are
ranked as 'ab'.



Table S5). Such differences in expression may be accounted 
for by differences in experimental design, stages of develop- 
ment, or may reflect species differences. 

The barley microarray data reveals that both Clo7 and 
Clo9 paralogs are relatively highly expressed in root tis- 
sue (Additional file 11: Table S6), which is in agreement 
with the 454 expression data for triticale (Figure 3). Clo9 
was also highly expressed in rye roots, but Clo7 was 
expressed at significantly lower levels in the same tissue 
(Figure 2). Clo8 was also seen to be expressed at high
levels in the roots of barley in the microarray data, but
not in the roots of rye and triticale in the 454 sequence
sets. Clo6 and Clo10 are highly expressed in barley anther
tissue as well as in the anthers of both rye and triticale;
however, 454 data for rye and triticale shows high
expression of additional family members in anther tissue
(Additional file 11: Table S6, Figures 2 and 3). Clo4 showed
no significant expression in any of the tissues assayed in
the barley microarray data, which parallels results from rye
and triticale (Additional file 11: Table S6, Figures 2 and 3).
Some differences in expression between the barley microarray
data and the rye and triticale 454 datasets include
high levels of Clo8 in the crown of barley but not of rye
(Additional file 11: Table S6, Figure 2). In addition there 
were differences in gene expression between the micro- 
array data of wheat and barley, for example, several caleo- 
sins showed significant expression in wheat pistils but not 
in the same tissue in barley. Overall, the wheat and barley 
array data show a diverse pattern of caleosin gene expres- 
sion in various tissues, and represent independent data 
sets that corroborate some of the tissue-specific caleosin 
gene expression patterns observed for rye and triticale. 

The expression of caleosins under stress conditions

Stress response experiments offer insight into changes in 
gene expression in response to environmental stresses. 
In response to osmotic stress administered as PEG treatment,
CloSB, CloSB, CloSR, and Clo11R were observed
to be significantly reduced compared to the level of the 
control in osmotically stressed seedlings (Additional 
file 12: Table S7). Clo9 was seen to be induced about 
four fold in the roots of salt-stressed rye plants (Additional 
file 13: Table S8). In contrast, salt stress was not seen to 
alter the expression of caleosins in the roots of similarly 
treated triticale plants. Previously published microarray 
analysis also sheds light on the involvement of caleosins 
in the stress response. The microarray analysis of cold ac- 
climation in wheat by Monroy et al. [32] included probes 
for four caleosin family members, Clo2 to Clo5, and results
show Clo3 to be strongly induced in shoots by cold
treatment of 4°C. Clo3 was induced within 6 h of treat- 
ment in winter wheat and maintained increased levels of 
expression up to 14 days of cold acclimation. CloS was 
also induced by cold treatment in spring wheat but 
showed a different profile of induction as seen by a significant
G × T effect in a 2-way ANOVA (Additional
file 14: Table S9). Two other microarray analyses looked 
at the expression of caleosin family members under stress 
conditions. One study by Aprile et al. [33] looked at the 
effect of drought stress on Triticum caleosin expression, 
and although this data set showed small changes in ex- 
pression in response to stress, they were found to be 
statistically non-significant. Another study looked at the 
ABA response in barley with the 22 k Affymetrix mi- 
croarray [31; GEO Accession: GSE10328], and results 
showed that both Clo10 and Clo11 had significant increases
in expression due to ABA stress (Additional file 15:
Table S10).
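
A fold change such as the roughly four-fold induction of Clo9 reported above is the ratio of normalized expression in treated and control samples. The Python sketch below shows that arithmetic; the function name, the optional pseudocount used to avoid division by zero, and the example RPKM values are assumptions made for illustration only and are not values from Additional file 13.

```python
import math


def fold_change(treated_rpkm, control_rpkm, pseudocount=0.0):
    """Ratio of treated to control expression; the pseudocount is optional."""
    denominator = control_rpkm + pseudocount
    if denominator <= 0:
        raise ValueError("control expression must be positive (or use a pseudocount)")
    return (treated_rpkm + pseudocount) / denominator


# Hypothetical values: 40 RPKM in salt-treated roots, 10 RPKM in controls.
fc = fold_change(40.0, 10.0)
print(fc, math.log2(fc))
```

Reporting the log2 of the ratio is a common convention because induction and repression of the same magnitude then have equal absolute values.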

The effect of polyploidization on the expression of R 
subgenome caleosins 

Two tissue types, stem and anther, were sampled for
both rye and triticale. This facilitated the comparison of
the R genome homeologs in both the rye and triticale
genetic background, and examination of the effect of
polyploidization on the expression of R gene copies. In
stem tissue, the most striking difference between the
two species was Clo11-R, which was the most highly
expressed caleosin in rye stems, but was expressed at
low levels in triticale, as were the A and B genome copies
of Clo11 (Figure 4A). Clo4-R was not detected in rye
stem tissue but was expressed at moderate levels in triticale
stems (Figure 4A); thus Clo11-R appears to have
been suppressed and Clo4-R appears to have been activated
by the polyploidization event. In anthers, the relative
level of expression of most caleosin gene R
homeologs was lower in triticale than in rye, and some
rye genes showed especially marked differences in expression
in the two species. Clo10-R and Clo6-R were
highly expressed in rye, whereas Clo10-R was not detected
in triticale, indicating a loss of expression in the R
genome of triticale in anthers (Figure 4B). Clo6-R also
had very low levels of expression in triticale (Figure 4B). 
CloS'R was detected in rye anthers but not in triticale 
anthers (Figure 4B); however, the level of expression was 
relatively low in rye, thus loss of expression in triticale 
could not be assessed with confidence. 
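
The caption of Figure 4 tests whether an R genome caleosin in triticale is expressed at one third of its level in rye. The source does not describe the exact set-up of that test, so the Python sketch below is only one plausible reading: a chi-square goodness-of-fit test with 1 degree of freedom on read counts, where the expected triticale count is one third of the rye read rate scaled to the triticale library size. The function name and all numbers are hypothetical.

```python
import math


def one_third_chi2(rye_reads, rye_library, trit_reads, trit_library):
    """Chi-square (1 df) test of triticale R-copy reads against rye/3.

    Returns (chi2, p_value). The expected triticale count is one third of
    the rye read rate, scaled to the triticale library size.
    """
    expected_gene = rye_reads * trit_library / (3.0 * rye_library)
    expected_other = trit_library - expected_gene
    if expected_gene <= 0 or expected_other <= 0:
        raise ValueError("expected counts must be positive")
    observed_other = trit_library - trit_reads
    chi2 = ((trit_reads - expected_gene) ** 2 / expected_gene
            + (observed_other - expected_other) ** 2 / expected_other)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: 300 reads in rye, 10 reads in triticale,
# both from libraries of one million reads.
chi2, p = one_third_chi2(300, 1_000_000, 10, 1_000_000)
print(f"chi2 = {chi2:.2f}, p = {p:.3g}")
```

The one-third expectation reflects the assumption that, with no change in regulation, the R copy would contribute an equal share alongside the A and B homeologs in the hexaploid.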

Discussion 

The caleosin gene family 

The 11 paralogous groups of caleosin genes present in
T. aestivum represent the largest gene family for caleosins
among the seven species included in this analysis.
Though the entire wheat genome has not yet been completely
sequenced, the description of the caleosin gene
family appears to be complete or nearly so, since orthologs
for all caleosins detected in rye, barley and Brachypodium
were also found in wheat. In addition, three
homeologous copies for each paralogous group were detected
in wheat as well as a fourth member of the Clo10
group. All 32 of the wheat caleosin genes identified in
the EST databases at GenBank were also identified in
the WSS, and two additional genes, Clo4-B and Clo11-A,
that were not represented in the T. aestivum EST database
at GenBank were identified in the WSS. This indicates
that the coverage of the WSS is very comprehensive.
The wheat caleosin genes were also represented in the
whole genome sequencing database (GenBank Whole-genome
shotgun contigs database) [39]. Brachypodium,
the most closely related monocot species with a com- 
pleted and annotated genome sequence available, has 
seven caleosin genes and rice has six caleosin genes. All 
caleosin genes in rice and Brachypodium have ortholo- 
gous sequences in wheat, barley and rye, as judged by 
sequence similarity, and seen in the phylogram in 
Figure 1. Since these more distantly related species have 
gene family members which share branches with se- 
quences from wheat throughout the tree, it seems that 
the representation of caleosins from wheat is likely 
complete. The larger number of caleosin genes in Triti- 
ceae, supported by identified sequences from wheat, rye 
and barley, and the representation of genes from Oryza 
on the major sub-branches in the phylogram indicate 
that the Triticeae had five gene duplications after the 
evolutionary separation from the Oryza, and before the 
separation of the three wheat progenitor species from 
each other. These duplications in the Triticeae lineage 
after the separation from rice are the Clo3 and Clo4 duplication,
the Clo5 and Clo6 duplication, the Clo7 and
Clo8 duplication, and the Clo2 and Clo10 duplication,
and subsequently the Clo10 and Clo11 duplication. The

Figure 4 A comparison of Clo gene expression in rye and triticale, to assay the changes in gene expression as a result of
polyploidization. The expression of caleosin gene family members was measured using RNA-seq analysis in the stem (A), and anther (B), of rye
and triticale, respectively. The 454-cDNA sequences aligned to each caleosin member were counted, then normalized based on gene lengths and
library depths using the RPKM method. The expression of each paralog is subdivided into expression of each of the three homeologs for triticale
(T), visualized as black bars for the R subgenome, grey bars for the B subgenome, and white bars for the A subgenome. The expression of caleosin
family members in rye (R) is represented by solid black bars. χ2 contingency tests based on the null hypothesis that the level of individual R
genome caleosins in triticale was not different from one third of the level of expression of the genes in rye were carried out. The * marks
individual caleosins with p < 0.05, where the null hypothesis was rejected.



localization of these putative gene duplications on the 
same chromosomes supports this notion, since gene 
duplications often occur as tandem duplications [40], 
though further investigation would be required to demonstrate
that the duplications were in tandem. Hv-Clo3
was not identified in the barley data sets in GenBank;
this may represent gene loss, loss of gene expression, or
may be due to incomplete transcriptome or genome sequence
from these species in the current databases. The
lack of a full length sequence for the rye Clo2, but the
presence of a gene sequence fragment for it in a high
throughput sequence database, is indicative of the advanced
but still incomplete state of the sequencing for
these species. The lengths of the branches in Figure 1 are
scaled according to amino acid differences between sequences,
thus providing an estimate of evolutionary distance.
The duplication of Clo10 and Clo11 appears to
have happened more recently than other duplications,
and the presence of Clo10 and Clo11 only in wheat, rye
and barley provides supporting evidence for duplication
only after the separation from the Brachypodium lineage.
A similar evolutionary pattern is also observed for Clo5
and Clo6, though that duplication appears to be older
than the Clo10, Clo11 duplication.

In Brachypodium, Bd-Clo1 and two copies of Clo2
(Bd-Clo2-A and Bd-Clo2-B) ([13]; Bradi5g15410, Bradi5g15417,
and Bradi5g15427) are identified as tandem
duplications. In addition, Bd-Clo3-A (Bradi1g70390) and
Bd-Clo4 (Bradi1g70400) are also adjacent to each other.
However, Clo3-A and Clo3-B, which share high sequence
similarity and are both on chromosome 1, are not
closely linked. Brachypodium provides evidence for both
recent and old tandem duplications for caleosins as well
as relatively recent non-tandem duplications. The Arabidopsis
caleosin genes are largely clustered together in the
phylogenetic tree suggesting that gene duplication events 
occurred independently in the monocotyledonous and 
dicotyledonous branches of the tree. 

Variation of caleosin gene expression in plant tissues

The diverse expression profile of the caleosin genes in rye 
and triticale tissues, as well as in wheat and barley tissues, 
suggests that these calcium binding proteins likely play a 
broad role during plant development. Whereas some 
caleosins exhibited more restricted, tissue-specific patterns 
of expression, such as Clo6 and Clo10, others, such as
Clo7, were detected in almost all rye and triticale tissues
sampled. It is therefore conceivable that some caleosins
may have a more general, 'housekeeping' role in most
plant tissues and cell types, whereas other caleosins may
have tissue- and cell-type specific roles in signaling and
regulation during plant development. Evidence that Clo3
in wheat acts as a GAP for the α-subunit of the heterotrimeric
G protein [1] raises the possibility that members of
the gene family may play key roles in signaling. The expression
of other calcium-binding proteins in plants that
function in all cell types or that have more restricted,
tissue-specific functions has been previously reported.
For example, calmodulin, the predominant Ca2+ sensor,
plays a critical role in decoding Ca2+ signatures into proper
cellular responses in numerous tissue types and cellular
compartments in eukaryotes [41]. Six members of the
calmodulin-like protein gene family are expressed in a
developmentally controlled pattern during nodulation
in the roots of Medicago truncatula [42], and a kinesin-like
calmodulin-binding protein was found to be selectively
expressed in the flowers, roots, and leaves of
Arabidopsis [43]. 

Although T. aestivum Clo3 was found to be expressed
in several triticale and rye tissues, there was no expression
in several tissues analyzed. Although these results
are in agreement with previous work [2] demonstrating
that the expression of At-Clo3, the ortholog of Ta-Clo3,
was undetectable in Arabidopsis root using northern
analysis, we have detected expression in Arabidopsis in
response to abscisic acid treatment [unpublished observations]
using transgenic plants with promoter:GUS gene
constructs. This underscores the challenges of tissue- 
specific expression analysis, since expression can be reg- 
ulated developmentally, as well as by other factors such 
as environmental conditions and hormonal fluxes, and 
may not be identified in the tissue samples taken at a 
limited number of time points of development that are 
represented in the cDNA sequence databases. The in- 
duction and repression of several caleosin gene family 
members in response to salt stress, cold acclimation and 
osmotic stress implies a wide role for members of the 
gene family in the environmental stress response. These 
observations warrant a more in-depth analysis of the tis- 
sue specificity and the time course for the changes in 
gene expression. Further study of potential partners in 
protein-protein interaction is also warranted, since Clo3
of wheat has previously been shown to interact with
both the α-subunit of the heterotrimeric G protein complex
as well as members of the PI-PLC gene family [1].

Interestingly, the gene expression results demonstrate 
the effect of polyploidization on the expression of R subge- 
nome caleosins. The synthetic triticale is an intergeneric 
allohexaploid generated from Triticum durum and rye. 
The initial combination of two or more genomes in one 
organism may lead to considerable genomic reorganization 
and changes in gene expression relative to the parental 
species. Soon after polyploidization, triticale underwent a 
loss of its combined DNA content in the range of 22 to 
30% [44,45], and DNA elimination of repetitive DNA and 
low-copy sequences from the rye genome in triticale have 
been reported in molecular studies [46,47]. The effect of 
polyploidization was clearly observed in the case of several 
caleosins, such as the expression of the rye homeologs 
Clo6-R and Clo10-R in triticale. The rye Clo6-R and
Clo10-R homeologs were found to be suppressed in anthers,
although expression of these caleosins was relatively
high in the corresponding rye tissues. Tissue-specific
silencing of homeologs from one of the genomes of
polyploid species has been reported in other allopolyploids,
including Tragopogon miscellus [48] and Gossypium
hirsutum [49], though the mechanisms that lead
to suppression are somewhat speculative at this point.
Since these genes are expressed in other triticale tissues,
gene deletion is clearly not the explanation for suppression,
and mechanisms related to chromatin remodelling
or the incompatibility of signaling and regulation pathways
of parental genomes in newly derived polyploids
warrant investigation.

Conclusions 

An apparently full set of wheat caleosin gene sequences
was acquired as full length cDNAs, the open reading
frames were identified, and the peptide sequences were
obtained. The gene sequences were confirmed with the
WSS database. One member of the caleosin gene family,
Clo3, has previously been identified as a stress inducible
gene encoding a Ca2+ binding protein that acts as a
negative regulator of the α subunit of the heterotrimeric
G protein GA3 [1]. The identification and full description
of the gene family for caleosins can be a significant
step in further investigating the role of members of this 
gene family in signaling and regulation. The very diverse 
pattern of tissue-specific expression indicates a potential 
for a very broad role in signaling and regulation 
throughout plant development. 

Availability of supporting data 

The full set of full length caleosin cDNA sequences for 
wheat and rye can be found in Additional file 16. The corrected
Hv-Clo6 cDNA sequence of H. vulgare, Hv-Clo8
derived from ESTs, and the Brachypodium sequence
Bradi1g70400 with an extended CDS are included in
Additional file 17. In addition, thirty one wheat caleosin
sequences are being deposited at GenBank; seven of these
are derived from direct sequencing of cDNA clones and
have been deposited in the nr database with accession
numbers HQ020505 and KJ523887 to KJ523892; 25 of
these are derived from assembly of EST sequences and are
being deposited at GenBank as Third Party Annotations
(TPA). Five sequences are derived from triticale 454 EST
libraries which are identical to T. aestivum caleosins and
are being submitted to the Transcriptome Shotgun Assembly
(TSA) database. Three wheat caleosin mRNA (i.e.
cDNA) equivalent sequences, Ta-Clo4-B, Ta-Clo10-2-A and
Ta-Clo11-A, were derived from genomic sequences at the
WSS database and do not qualify for deposit at GenBank. 
The ten Secale cereale caleosin sequences derived from 
454 cDNA sequences are being deposited in the TSA 
database at GenBank. The raw Secale cereale 454 se- 
quences have been deposited at the DNA Data Bank of 
Japan with identifier DRA000384. 

Additional files 



Additional file 16: The complete set of full length cDNA/mRNA 
sequences for caleosins from wheat and rye. 

Additional file 17: Corrected caleosin sequences from Hordeum
vulgare and Brachypodium distachyon. These versions of the
sequences are not available in GenBank.



Abbreviations 

GAP: GTPase activating protein; PI-PLC: Phosphoinositide-specific 
phospholipase C; RD20: Responsive to dehydration 20; RPKM: Reads per kb 
of gene length per million reads; UNM: Uninucleate microspore; 
TCP: Tricellular pollen; PEG: Polyethylene glycol.

Competing interests 

The authors declare that they have no competing interests. 

Authors' contributions

HBK, SCB and UMP carried out gene sequence compilation, participated in
the experimental design, performed the experiments, and analyzed data.
DM contributed to the writing and editing of the manuscript. AL completed 
the 454 sequencing. PJG designed the project, carried out gene sequence 
compilation and verification, and part of the 454 sequencing. UMP, HBK, SCB, 
DM, AL, and PJG contributed to writing and revision of the manuscript. 
All authors read and approved the final manuscript. 

Acknowledgements 

We would like to thank the International Wheat Genome Sequencing 
Consortium (IWGSC) for providing early access to the chromosome-based 
survey sequences. This work was supported by grants from the Natural 
Science and Engineering Research Council of Canada, and the Agricultural 
Bioproducts Innovation Program of Agriculture and Agri-Food Canada. We 
would like to thank David Walsh for advice in expression analysis and Ian 
Ferguson for advice on statistical analysis. 

Author details 

Biology Department, Concordia University, 7141 Sherbrooke W, Montreal,
QC H4B 1R6, Canada. Agriculture and Agri-Food Canada, Lethbridge
Research Centre, 5403, 1st Avenue South, CP. 3000, T1J 4B1 Lethbridge, AB,
Canada. Present address: Department of Genetics, Faculty of Agriculture,
Ain-Shams University, Shoubra El-khema, Cairo, Egypt

Received: 28 March 2013 Accepted: 22 February 2014 
Published: 27 March 2014 

References 

1. Khalil HB, Wang Z, Wright JA, Ralevski A, Donayo AO, Gulick PJ:
Heterotrimeric Gα subunit from wheat (Triticum aestivum), GA3, interacts
with the calcium-binding protein, Clo3, and the phosphoinositide-specific
phospholipase C, PI-PLC1. Plant Mol Biol 2011, 77:145-158.

2. Takahashi S, Takeshi K, Yamaguchi-Shinozaki K, Shinozaki K: An Arabidopsis
gene encoding a Ca2+-binding protein is induced by abscisic acid
during dehydration. Plant Cell Physiol 2000, 41:898-903.

3. Aubert Y, Vile D, Pervent M, Aldon D, Ranty B, Simonneau T, Vavasseur A,
Galaud JP: RD20, a stress-inducible caleosin, participates in stomatal
control, transpiration and drought tolerance in Arabidopsis thaliana. Plant
Cell Physiol 2010, 51:1975-1987.

4. Liu H, Hedley P, Cardle L, Wright KM, Hein I, Marshall D, Waugh R:
Characterisation and functional analysis of two barley caleosins
expressed during barley caryopsis development. Planta 2005,
221:513-522.

5. Chiang CJ, Che CJ, Lin LJ, Chang C, Chao YP: Selective delivery of cargo
entities to tumor cells by nanoscale artificial oil bodies. J Agric Food
Chem 2010, 58:11695-702.

6. Huang S, Sirikhachornkit A, Su X, Paris J, Gill B, Haselkorn R, Gornicki P:
Genes encoding plastid acetyl-CoA carboxylase and 3-phosphoglycerate
kinase of the Triticum/Aegilops complex and the evolutionary history of
polyploid wheat. Proc Natl Acad Sci USA 2002, 99:8133-8.

7. Mergoum M, Singh PK, Pena RJ, Lozano-del Rio AJ, Cooper KV, Salmon DF, 
Gomez Macpherson H: Triticale: a 'new' crop with old challenges. Cereals 
Vol 3. Edited by Carena MJ. New York: Springer; 2009:267-290. 



Additional file 1: Table S1. Summary of rye cDNA libraries and derived
454 reads. 

Additional file 2: Table S2. Summary of triticale cDNA libraries and 
derived 454 reads. 

Additional file 3: Figure S1. Workflow used to measure the abundance
of caleosin gene family members in thirteen rye and triticale 454-cDNA 
libraries expressed in different tissues. 

Additional file 4: Identifiers for Clo genes on microarrays. 

Additional file 5: Figure S2. Molecular phylogenetic analysis of 
T. aestivum caleosin nucleotide sequences by the maximum likelihood 
method. 

Additional file 6: Table S3. Intron-exon structure of caleosin genes in 
T. aestivum. 

Additional file 7: Table S4. Caleosin gene exons in H. vulgare and 
Brachypodium. 

Additional file 8: Figure S3. The relative level of expression of ten 
caleosin gene family members in four different rye tissues based on 454 
sequencing. The data in this figure is a reorganized version of the data 
presented in Figure 2. 

Additional file 9: Figure S4. The relative level of expression of eleven 
caleosin gene family members in five different triticale tissues. The data 
in this figure is a reorganized version of the data presented in Figure 3. 

Additional file 10: Table S5. Microarray analysis of caleosin gene 
expression measured in a panel of thirteen Triticum aestivum tissues. 

Additional file 11: Table S6. Microarray analysis of caleosin gene 
expression measured in a panel of fifteen Hordeum vulgare genotype 
Morex tissues. 

Additional file 12: Table S7. Caleosin expression in response to PEG 
treatment of triticale seedlings measured in 454 cDNA libraries. 

Additional file 13: Table S8. Comparison of caleosin expression in salt 
treated and control rye roots (RPKPM) measured in 454 cDNA libraries. 

Additional file 14: Table S9. Microarray analysis of the effect of cold 
treatment on caleosin gene expression in wheat. 

Additional file 15: Table S10. Microarray analysis of levels of expression 
of the caleosin paralogs in barley crown tissue under ABA treatment. 






8. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ: Basic local alignment 
search tool. J Mol Biol 1990, 215:403-410. 

9. Huang X, Madan A: CAP3: A DNA sequence assembly program. Genome 
Res 1999, 9:868-77. 

10. ExPASy Translate Tool. In http://web.expasy.org/translate/. 

11. International Wheat Genome Sequencing Consortium. In 
[http://www.wheatgenome.org/Tools-and-Resources] 

12. International Barley Sequencing Consortium. In 
http://webblast.ipk-gatersleben.de/barley/viroblast.php. 

13. Brachypodium database. In http://brachypodium.org/. 

14. NCBI Batch Conserved Domain Search tool. In 
[http://www.ncbi.nlm.nih.gov/Structure/bwrpsb/bwrpsb.cgi] 

15. RCSB Protein Data Bank. In http://www.rcsb.org/pdb/home/home.do. 

16. InterPro Scan Sequence Search. European Bioinformatics Institute, 2011. 
In http://www.ebi.ac.uk/Tools/pfa/iprscan/. 

17. Simple Modular Architecture Research Tool (SMART). In 
http://smart.embl-heidelberg.de/. 

18. Tamura K, Peterson D, Peterson N, Stecher G, Nei M, Kumar S: MEGA5: 
Molecular Evolutionary Genetics Analysis using maximum likelihood, 
evolutionary distance, and maximum parsimony methods. Mol Biol Evol 
2011, 28:2731-2739. 

19. Steel MA, Fu YX: Classifying and counting linear phylogenetic invariants 
for the Jukes-Cantor model. J Comput Biol 1995, 2:39-47. 

20. Edgar RC: MUSCLE: multiple sequence alignment with high accuracy and 
high throughput. Nucleic Acids Res 2004, 32(5):1792-7. 

21. Whelan S, Goldman N: A general empirical model of protein evolution 
derived from multiple protein families using a maximum-likelihood 
approach. Mol Biol Evol 2001, 18:691-9. 

22. Gramene database. In [http://www.gramene.org, August 2013]. 

23. Tran F, Penniket C, Patel RV, Provart NJ, Laroche A, Rowland O, Robert LS: 
Developmental transcriptional profiling reveals key insights into Triticeae 
reproductive development. Plant J 2013, 74:971-988. 

24. Zadoks JC, Chang TT, Konzak CF: A decimal code for the growth stages of 
cereals. Weed Res 1974, 14:415-421. 

25. Gulick P, Dvorak J: Gene induction and repression by salt treatment in 
roots of the salinity-sensitive Chinese Spring wheat and the salinity- 
tolerant Chinese Spring x Elytrigia elongata amphiploid. Proc Natl Acad 
Sci USA 1987, 84(1):99-103. 

26. Comeau A, Nodichao L, Collin J, Baum M, Samsatly J, Hamidou D, Langevin 
F, Laroche A, Picard E: New approaches for the study of osmotic stress 
induced by polyethylene glycol (PEG) in cereal species. Cereal Res 
Commun 2010, 38:471-481. 

27. Goecks J, Nekrutenko A, Taylor J: Galaxy: a comprehensive approach for 
supporting accessible, reproducible, and transparent computational 
research in the life sciences. Genome Biol 2010, 11:R86. 

28. Fu L, Niu B, Zhu Z, Wu S, Li W: CD-HIT: accelerated for clustering the 
next-generation sequencing data. Bioinformatics 2012, 28:3150-2. 

29. Schreiber AW, Sutton T, Caldo RA, Kalashyan E, Lovell B, Mayo G, 
Muehlbauer GJ, Druka A, Waugh R, Wise RP, Langridge P, Baumann U: 
Comparative transcriptomics in the Triticeae. BMC Genomics 2009, 10:285. 

30. Druka A, Muehlbauer G, Druka I, Caldo R, Baumann U, Rostoks N, Schreiber A, 
Wise R, Close T, Kleinhofs A, Graner A, Schulman A, Langridge P, Sato K, 
Hayes P, McNicol J, Marshall D, Waugh R: An atlas of gene expression from 
seed to seed through barley development. Funct Integr Genomics 2006, 
6:202-211. 

31. PLEXdb database. In [http://www.plexdb.org] 

32. Monroy AF, Dryanova A, Malette B, Oren DH, Ridha Farajalla M, Liu W, 
Danyluk J, Ubayasena LW, Kane K, Scoles GJ, Sarhan F, Gulick PJ: Regulatory 
gene candidates and gene expression analysis of cold acclimation in 
winter and spring wheat. Plant Mol Biol 2007, 64(4):409-23. 

33. Aprile A, Mastrangelo AM, De Leonardis AM, Galiba G, Roncaglia E, Ferrari F, 
De Bellis L, Turchi L, Giuliano G, Cattivelli L: Transcriptional profiling in 
response to terminal drought stress reveals differential responses along 
the wheat genome. BMC Genomics 2009, 10:279. 

34. ClustalW2 - Multiple Sequence Alignment. In 
[www.ebi.ac.uk/Tools/msa/clustalw2] 

35. Farajalla R, Gulick PJ: The alpha-tubulin gene family in wheat (Triticum 
aestivum L.) and differential gene expression during cold acclimation. 
Genome 2007, 50(5):502-10. 

36. Miftahudin RK, Ma XF, Mahmoud AA, Layton J, Milla MA, Chikmawati T, 
Ramalingam J, Feril O, Pathan MS, Momirovic GS, Kim S, Chema K, Fang P, 
Haule L, Struxness H, Birkes J, Yaghoubian C, Skinner R, McAllister J, Nguyen 
V, Qi LL, Echalier B, Gill BS, Linkiewicz AM, Dubcovsky J, Akhunov ED, Dvorak 
J, Dilbirligi M, Gill KS, Peng JH, et al: Analysis of expressed sequence tag 
loci on wheat chromosome group 4. Genetics 2004, 168:651-663. 

37. Riaño-Pachón DM, Nagel A, Neigenfind J, Wagner R, Basekow R, Weber E, 
Mueller-Roeber B, Diehl S, Kersten B: GabiPD: the GABI primary 
database - a plant integrative 'omics' database. Nucleic Acids Res 2009, 
37(Database issue):D954-9. 

38. The Arabidopsis Information Resource (TAIR). In [http://www.arabidopsis.org/] 

39. Brenchley R, Spannagl M, Pfeifer M, Barker GL, D'Amore R, Allen AM, 
McKenzie N, Kramer M, Kerhornou A, Bolser D, Kay S, Waite D, Trick M, 
Bancroft I, Gu Y, Huo N, Luo MC, Sehgal S, Gill B, Kianian S, Anderson O, 
Kersey P, Dvorak J, McCombie WR, Hall A, Mayer KF, Edwards KJ, Bevan MW, 
Hall N: Analysis of the bread wheat genome using whole-genome 
shotgun sequencing. Nature 2012, 491(7426):705-10. 

40. Arabidopsis Genome Initiative: Analysis of the genome sequence of the 
flowering plant Arabidopsis thaliana. Nature 2000, 408:796-815. 

41. Kim MC, Chung WS, Yun DJ, Cho MJ: Calcium and calmodulin-mediated 
regulation of gene expression in plants. Mol Plant 2009, 2:13-21. 

42. Liu J, Miller SS, Graham M, Bucciarelli B, Catalano CM, Sherrier DJ, Samac DA, 
Ivashuta S, Fedorova M, Matsumoto P, Gantt JS, Vance CP: Recruitment of 
novel calcium-binding proteins for root nodule symbiosis in Medicago 
truncatula. Plant Physiol 2006, 141:167-177. 

43. Reddy AS, Narasimhulu SB, Safadi F, Golovkin M: A plant kinesin heavy 
chain-like protein is a calmodulin-binding protein. Plant J 1996, 10:9-21. 

44. Boyko EV, Badaev NS, Maximov NG, Zelenin AV: Regularities of genome 
formation and organization in cereals: I. DNA quantitative changes in 
the process of allopolyploidization. Genetika 1988, 24:89-97. 

45. Bennett MD, Leitch IJ: Nuclear DNA amounts in angiosperms: targets, 
trends and tomorrow. Annal Bot 2011, 107:467-590. 

46. Ma X-F, Pang P, Gustafson JP: Polyploidization-induced genome 
variation in triticale. Genome 2004, 47:839-848. 

47. Ma X-F, Gustafson JP: Timing and rate of genome variation in triticale 
following allopolyploidization. Genome 2006, 49:950-958. 

48. Buggs RJA, Doust AN, Tate JA, Koh J, Soltis K, Feltus FA, Paterson AH, 
Soltis PS, Soltis DE: Gene loss and silencing in Tragopogon miscellus 
(Asteraceae): comparison of natural and synthetic allotetraploids. Heredity 
2009, 103:73-81. 

49. Adams KL, Cronn R, Percifield R, Wendel JF: Genes duplicated by 
polyploidy show unequal contributions to the transcriptome and 
organ-specific reciprocal silencing. Proc Natl Acad Sci USA 2003, 
100:4649-4654. 



doi:10.1186/1471-2164-15-239 

Cite this article as: Khalil et al.: Characterization of the caleosin gene 
family in the Triticeae. BMC Genomics 2014 15:239. 






